The Holy Grail of Sense Definition: Creating a Sense-Disambiguated Corpus from Scratch
Abstract
This paper presents a methodology for creating a gold standard for sense definition using Amazon’s Mechanical Turk service. We demonstrate how this method can be used to create in a single step, quickly and cheaply, a lexicon of sense inventories and the corresponding sense-annotated lexical sample. We show the results obtained by this method for a sample verb and discuss how it can be improved to produce an exhaustive lexical resource. We then describe how such a resource can be used to further other semantic annotation efforts, using as an example the Generative Lexicon Mark-up Language (GLML) effort.
Similar resources
An Iterative Approach to Word Sense Disambiguation
In this paper, we present an iterative algorithm for Word Sense Disambiguation. It combines two sources of information, WordNet and a semantically tagged corpus, to identify the correct sense of the words in a given text. It differs from other standard approaches in that the disambiguation process is performed in an iterative manner: starting from free text, a set of disambiguat...
Case study of BushBank concept
In this paper, we present a new type of annotated corpus, called BushBank, which improves handling of ambiguity in natural language. Unlike in traditional approaches where data are directly disambiguated, in a BushBank, disambiguation is done later, based on application needs. This has major impact on the structures used in the corpus, since ordinary syntactic trees disallow ambiguity. Our appr...
A Korean Homonym Disambiguation System Based on Statistical Model Using Weights
A homonym can be disambiguated by other words in its context, such as the nouns and predicates used with the homonym. This paper uses semantic information (co-occurrence data) obtained from definitions in the part-of-speech (POS) tagged UMRD-S. In this research, we have analyzed the result of an experiment on a homonym disambiguation system based on a statistical model, to which Bayes' theorem is applied...
Analyzing the concept of sense of place and its effect on the identity of place in new cities
The purpose of the current article is to study the effect of sense of place on creating urban identity in new cities. To this end, the key concepts of the research and the theories related to the issue were studied. The data were collected using the library method and note taking. The studies indicated the presence of human in the environmen...
Wikicorpus: A Word-Sense Disambiguated Multilingual Wikipedia Corpus
This article presents a new freely available trilingual corpus (Catalan, Spanish, English) that contains large portions of the Wikipedia and has been automatically enriched with linguistic information. To our knowledge, this is the largest such corpus that is freely available to the community: In its present version, it contains over 750 million words. The corpora have been annotated with lemma...